A random subspace ensemble classification model for discrimination of power quality events in solar PV microgrid power network

This study proposes an SVM based Random Subspace (RS) ensemble classifier to discriminate between different Power Quality Events (PQEs) in a photovoltaic (PV) connected Microgrid (MG) model. The MG model is developed and simulated in the presence of different PQEs (voltage and harmonic related signals and distinctive transients) in both the on-grid and off-grid modes of the MG network. In the pre-stage of classification, features are extracted from the PQE signals by Discrete Wavelet Transform (DWT) analysis, and the extracted features are used to train the classifiers at the final stage. First, three kernel types of SVM classifier (Linear, Quadratic, and Cubic) are used to predict the different PQEs. Among these, the Cubic kernel SVM classifier offers higher accuracy and better performance than the other kernel types (Linear and Quadratic). Further, to enhance the accuracy of the SVM classifiers, an SVM based RS ensemble model is proposed, and its effectiveness is verified against the results of the kernel based SVM classifiers under the standard test condition (STC) and under real time varying solar irradiance of the PV source. The final results show that the proposed method is more robust and offers higher classification accuracy than the kernel based SVM classifiers.


Introduction
A Microgrid (MG) generally provides a reliable, economic, and secure energy supply to critical loads and remote communities, with the following additional features: it promotes demand side management, offers an energy supply with low carbon emission, accommodates multiple generating options from different types of Distributed Generation (DG) sources, and so on [1]. It is a major challenge to maintain the quality of energy supply in the MG network.
SVM with different kinds of kernel functions can be used to enhance classifier performance when solving the non-linear classification problems that arise in PQ study [26]. The kernel function transforms inseparable data from a low-dimensional space to a higher-dimensional space where the data can be separated more accurately. The different types of SVM kernel functions include the linear kernel, the polynomial kernel, and the Gaussian kernel (Radial Basis Function), etc. [27]. Biswal et al. [28] proposed a multiclass SVM using a linear kernel function combined with a disturbances versus normal (DVN) approach of feature extraction for classifying complex PQEs in power systems. Radial Basis Function (RBF) and polynomial kernel-based SVMs were introduced by the authors of [29] in a hybrid DG environment of a power system network. Similarly, the authors in [30] utilised SVM with an RBF kernel to detect the disturbance patterns in three-phase simulated signals. Most of the intelligent classifiers, like ANN, PNN, NB, KNN, SVM, and the different kernels of SVM, are stated in the literature to have their own strengths and weaknesses. To enhance the precision and generalisation ability of individual weak learners, researchers have used several ensemble classifiers. Ensemble classifiers are mainly used to improve the overall performance and stability of weak classifiers by combining their output predictions in different ways [31].
Several research studies have shown that the ensemble approach to classification offers promising accuracy compared to individual weak classifiers. Several ensemble classifiers have been used by researchers to discriminate between different PQEs in conventional and RE integrated power system networks. The Bagging ensemble classifier with the flexible analytic wavelet transform (FAWT) method in [32] is applied to discriminate multiple PQEs in RE connected power networks, with promising results compared to individual weak classifiers. The S-Transform extraction method with the Adaboost ensemble approach [33] and Hilbert Huang Transform feature extraction with adaptive NFS [34] have been used for PQ analysis, achieving higher accuracy and better performance than single classifiers. Furthermore, DWT analysis with the voting approach in [35] and the stacking ensemble approach in [36] have shown better effectiveness in predicting various PQEs in the PV integrated power network. Similarly, to improve the classification accuracy and robustness of individual weak classifiers, the authors in [37] used the Random Forest classifier to discriminate multiple PQ signals in an RE connected power network. Thus, the literature indicates that ensemble models can significantly improve the overall accuracy and generalisation ability of weak classifiers. Hence, in this study, an SVM based Random Subspace (RS) ensemble classifier is proposed to discriminate between different PQEs in the MG network. The classifiers used in the RS ensemble method are constructed with different subsets of features, which are sampled randomly from the main data set [38]. Because it uses randomly selected feature subsets, the RS method can provide low bias risk and enhance the prediction performance of the weak classifiers. The RS method also offers superior performance when the training data set has redundant features [39][40][41].
In most of the research works [2,4,23,24,42,43], the PQ analysis in the MG network was carried out in either the on-grid or the off-grid (islanded) mode of MG operation. However, to ensure reliable operation and improved PQ of the MG network, it is necessary to discriminate between the PQEs in both the on-grid and off-grid modes of MG operation. Also, only limited research work was observed on the analysis of PQEs in the MG network using an ensemble approach of classification under the weather intermittence condition of RE sources. Hence, in this study, discrimination of different PQEs is considered in both modes (on-grid and off-grid) of the PV connected MG network under real time variation of solar PV irradiance. From the final results of the classification analysis, it is inferred that the proposed SVM based RS ensemble classification model outperforms the different types of kernel based SVM classifiers (Linear, Quadratic, and Cubic) in terms of classification accuracy and performance level.
The important objectives of this research study are listed below:
This article is structured as follows: Section 2 explains the definition of the MG simulation model with the addition of various PQEs, Section 3 describes the concept of the classification framework model, Section 4 presents the details of the data acquisition and signal processing method, Section 5 describes the SVM classifier concept with various kinds of kernel functions (Linear, Quadratic, and Cubic) and the proposed RS ensemble classification model, Section 6 discusses the classification results and performance analysis of the proposed RS ensemble classification model and the kernel based SVM classifiers, Section 7 describes the comparative analysis, and Section 8 concludes with the outcomes and future scope of this study.

Overview of MG model
The Matlab-Simulink software tool is used to develop a PV integrated MG simulation model. The MG model is simulated with the introduction of different PQEs (normal, voltage sag and swell, harmonic distortions, and transients due to switching of the capacitor bank, PV inverter, and LG fault) for analysis. The configurations of the on-grid and off-grid MG models are portrayed in Fig 1A and 1B, respectively. The MG model includes different types of Distributed Generation (DG) sources (solar photovoltaic (PV) and diesel powered genset) and loads (linear and non-linear). The MG model also includes 25 kV feeder lines, each with a length of 2 km. Details of the power ratings of all elements used in the MG network are shown in Table 1.

Description of different PQEs
During PQ analysis, the threshold limits of the different PQEs in the MG network follow the IEEE 1159 standard [44]. Normal and the three most common voltage-related PQEs (sag, swell, and harmonic distortions) are generated by switching heavy (sag/swell) and non-linear (harmonics) loads in the off-grid mode of the MG network. Furthermore, three PQ transients are generated by switching of the capacitor bank (transients-1), the PV inverter (transients-2), and a ground fault-LG (transients-3) in both modes (on-grid and off-grid) of the MG network. The PQEs with the corresponding switching actions are listed in Table 2.

DWT method of feature extraction
The wavelet transform (WT) analysis method is one of the most effective methods for decomposing a fast varying signal into numerous sub-components in the time and frequency domains [45]. The WT is available in continuous and discrete variants. The continuous wavelet transform (CWT) can be used to address the resolution constraint of the short-time Fourier transform (STFT), but in real-time applications it is less beneficial and has low repetition.

PLOS ONE
The discrete wavelet transform (DWT) can be used to overcome the drawbacks of the CWT and is defined mathematically in Eq (1) [46,47], where $a_0^m$ is the scaling factor, $n b_0 a_0^m$ is the translation factor, $m$ and $n$ are integers, $X(n)$ is the time signal, and $f$ is the mother wavelet function.
Multi Resolution Analysis (MRA) is typically used in the DWT process to obtain the wavelet transform coefficients (detail and approximation) by decomposing the input signal. MRA is well suited to decomposing PQE signals because it uses little memory and is simple to implement. In this process, a series of filter banks is used at each stage of decomposition to decompose the signal at different resolutions. Fig 3 indicates the decomposition of the test signal up to the second stage. The input signal V(n) is passed through a pair of high-pass (g1) and low-pass (h1) filters to obtain the detail (D1, D2) and approximation (A1, A2) coefficients. In addition, the signal is downsampled by a factor of two at each stage, and the approximation coefficients are used for further decomposition. This decomposition process is continued until the specified decomposition level is reached [46,47].
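The filter-bank decomposition described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the db4 low-pass taps are taken from standard tables, the high-pass filter is derived from them by the usual quadrature-mirror rule, the signal is extended periodically at its edges, and the test signal (a sine with a sag, 64 samples per cycle) is hypothetical.

```python
import math

# db4 (8-tap) low-pass filter h1; values assumed from standard wavelet tables.
H1 = [0.23037781330885523, 0.7148465705525415, 0.6308807679295904,
      -0.02798376941698385, -0.18703481171888114, 0.030841381835986965,
      0.032883011666982945, -0.010597401784997278]
# High-pass filter g1 from the quadrature-mirror rule g1[k] = (-1)^k * h1[L-1-k].
G1 = [(-1) ** k * H1[len(H1) - 1 - k] for k in range(len(H1))]


def dwt_step(signal):
    """One filter-bank stage: filter with h1/g1 and keep every second sample."""
    n = len(signal)
    approx, detail = [], []
    for i in range(0, n, 2):  # stepping by 2 performs the downsampling
        approx.append(sum(H1[k] * signal[(i + k) % n] for k in range(len(H1))))
        detail.append(sum(G1[k] * signal[(i + k) % n] for k in range(len(G1))))
    return approx, detail


def mra(signal, levels):
    """Repeat the stage on the approximation; returns (A_levels, [D_1..D_levels])."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        details.append(detail)
    return approx, details


# Hypothetical test signal: 4 cycles of a unit sine, with a sag to 0.5 in cycle 3.
signal = [(0.5 if 128 <= n < 192 else 1.0) * math.sin(2 * math.pi * n / 64)
          for n in range(256)]
a5, (d1, d2, d3, d4, d5) = mra(signal, levels=5)
```

Because the filters are orthonormal, the total energy of the coefficients equals the energy of the input, which is a convenient sanity check on any implementation.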

The detail ($D_i$) and approximation ($A_i$) coefficients at the $i$th level can be evaluated using Eqs (2) and (3). The high-pass and low-pass filters are associated with the wavelet function $\omega(t)$ and the scaling function $\beta(t)$, and can be expressed as,
$$\omega(t) = \sqrt{2} \sum_{n} g1(n)\, \beta(2t - n) \qquad (4)$$
$$\beta(t) = \sqrt{2} \sum_{n} h1(n)\, \beta(2t - n) \qquad (5)$$
According to the literature [48,49], the Daubechies-4 (db4) mother wavelet is commonly used in PQ analysis to detect fast transient signals in the power system. Therefore, in this research work, the db4 mother wavelet is considered for the analysis of PQE signals.

Evaluation of energy value
Feature extraction reduces the dimension of the input vector matrix while retaining the information that is useful for the classifiers. Using Eq (6) [49], the energy value (EV) can be estimated from the detail coefficients of the DWT analysis.
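As an illustration of the energy feature, the sketch below assumes that the energy value takes the common form of a sum of squared detail coefficients; the exact expression of Eq (6) may differ (for example by a normalisation), and the coefficient values are hypothetical.

```python
def energy_value(detail_coefficients):
    """Energy of one decomposition level: sum of squared coefficients (assumed form)."""
    return sum(c * c for c in detail_coefficients)


# Hypothetical detail coefficients of a single level.
d5 = [3.0, -4.0, 0.0, 0.0]
ev = energy_value(d5)  # 9 + 16 = 25
```

One energy value per signal (or per level, or per phase) then replaces the full coefficient vector as the classifier input, which is what reduces the feature dimension.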

Materials and classification methods
In this research, the WEKA software tool is used to discriminate between different PQEs in the MG network using the extracted features. WEKA is an effective tool that includes several classification algorithms and provides base and ensemble classification, clustering, and visualization facilities [10]. In this study, different types of kernel based SVM classifiers (SVM linear kernel and SVM polynomial kernels (quadratic and cubic)) and the SVM based RS ensemble classification approach are considered to classify the PQEs normal, sag, swell, and harmonic distortion, with the class labels K1, K2, K3, and K4, respectively, in the off-grid mode of the MG network. In addition, three PQ transients (due to switching of the capacitor bank and PV inverter, and LG fault) are classified in both modes (off-grid and on-grid) of the MG network with consideration of the following class labels: K5, (
The estimated energy values from the extracted features of the various PQE signals are used to train the kernel based SVM learners during the first phase of classification. While training the classifiers, a k-fold cross validation method is applied to the input data set to reduce the risk of over-fitting. The prediction capability of the classifiers can be assessed with the help of the cross validation method [50]. The training data (X) is separated into k disjoint subsets of equal size (X1, X2, ..., Xk). Of the k subsets, one is used for testing and the remaining k-1 subsets are used for classifier training [50]. In this work, cross validation with 10 folds is considered while training the classifiers. This section describes the kernel based SVM classifiers (linear and polynomial (quadratic and cubic)) and the proposed RS ensemble classifier in more detail.
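The k-fold procedure described above can be sketched as follows. This is a generic illustration, not the WEKA implementation (which may, for instance, stratify the folds by class); the function name `k_fold_indices` and the fixed random seed are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint, near-equal subsets
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# With 400 instances and 10 folds, each round trains on 360 and tests on 40.
splits = list(k_fold_indices(400, k=10))
```

Averaging the accuracy over the ten test folds gives an estimate that does not depend on a single train/test partition.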

SVM classifier
SVM is a flexible machine learning algorithm for pattern recognition and classification applications [4]. The SVM algorithm was developed by Vapnik [51] and operates on the basis of supervised learning theory. SVM seeks the optimal separating hyperplane by maximising the margin between the data set and the hyperplane [52]. It offers good generalisation accuracy on unknown data and supports intensive optimization methods that enable it to learn from large scale data [53]. An example of the SVM concept is shown in Fig 4. Consider a given training data set $\{x_i, y_i\}_{i=1}^{K}$, where $x_i \in \mathbb{R}^n$ are the input data vectors, $y_i \in \{+1, -1\}$ denotes the class, and $K$ is the number of samples. The training data set can be separated linearly by the hyperplane $f(x)$, as represented by Eq (7) [52][53][54], where $w$ and $b$ are the weight and bias terms used to optimise the position of the separating hyperplane. The constraints given in Eq (8) should be satisfied by the separating hyperplane.
The distance between the margin and the vectors $x_i$ that lie on the incorrect side of the margin is represented by the positive slack variable $\xi_i$. To separate the given data, the optimal hyperplane is determined by solving the optimization problem expressed in Eq (9), subject to $y_i (w \cdot x_i + b) \geq 1 - \xi_i$ and $\xi_i \geq 0$. Let $C$ denote the penalty for error. By using the Lagrangian multipliers $\alpha_i$, the optimization problem (Eq (9)) is transformed into the dual quadratic optimization problem expressed in Eq (10) [54], subject to $\sum_{i=1}^{K} \alpha_i y_i = 0$ and $\alpha_i \geq 0$. Solving the dual optimization problem gives the linear decision function expressed in Eq (11).
The kernel functions of SVM are useful for solving nonlinear classification problems. By using a nonlinear function $\phi$, the kernel functions of SVM transform inseparable data from a low-dimensional space to a higher-dimensional space where the data can be separated linearly [52]. The non-linear decision function is obtained by including the kernel function $k(x_i, x_j)$, which can be written as the inner product of $\phi(x_i)$ and $\phi(x_j)$. In this study, SVM classifiers with different kernel functions like linear, polynomial (quadratic and cubic), and RBF (Gaussian fine) have been used to categorise the various PQEs in the MG model of the power network. Furthermore, to classify the multi class PQEs in the MG network, the kernel based SVM classifiers are used with the One Against One (OAO) multiclass method [55]. The classification of the various PQEs in the MG network using kernel based SVM classifiers is shown in Fig 5. A 10-fold cross validation method is applied to the given input data set (400 instances (40 instances per PQE) and three features) while training the kernel based SVM classifiers (linear kernel and polynomial kernels (Quadratic and Cubic)).
In the final decision phase, predictions of class values are obtained from each classifier.
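The One Against One scheme mentioned above trains one binary classifier per pair of classes and lets them vote. The sketch below illustrates only the voting logic. It assumes ten classes (400 instances at 40 per PQE), numbers them 0 to 9 for convenience, and uses a hypothetical toy rule in place of each trained binary SVM.

```python
from collections import Counter
from itertools import combinations


def one_against_one_predict(pairwise_classifiers, x):
    """Each pairwise classifier votes for one of its two classes; the majority wins."""
    votes = Counter(classify(x) for classify in pairwise_classifiers.values())
    return votes.most_common(1)[0][0]


def toy_pairwise(a, b):
    """Hypothetical stand-in for a binary SVM trained on classes a and b:
    it picks whichever class index is closer to the scalar input x."""
    return lambda x: a if abs(x - a) <= abs(x - b) else b


classes = list(range(10))  # assumed: ten PQE classes
pairwise_classifiers = {(a, b): toy_pairwise(a, b)
                        for a, b in combinations(classes, 2)}
n_binary = len(pairwise_classifiers)  # 10 * 9 / 2 = 45 binary problems
```

For ten classes this means 45 binary problems, each trained only on the instances of its two classes, so every individual problem stays small.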

SVM linear kernel.
The linear kernel is a simple and easy to interpret kernel function. It is a fast data mining algorithm for solving multiclass classification problems, and it is suited to large data sets with many features. The linear kernel function is expressed as in [56], where $k(x_i, x_j)$ is the kernel function, $x_i$ and $x_j$ are feature space vectors, and 'C' is the box constraint or regularization parameter. The value of the regularization parameter (C) strongly influences the trade-off between maximising the classification margin and minimising the error [57]. In this study, the value of 'C' for the linear kernel of SVM is set to 9, as this gave the highest accuracy and the minimum error level. The steps of the classification process with the linear kernel SVM classifier are illustrated in Table 3.

SVM polynomial kernel.
Table 3. Process steps of classification: SVM linear kernel.

The polynomial kernel is a global kernel with good generalization ability. It is useful for learning high dimensional data with nonlinear boundaries, and its kernel parameters have a substantial effect on the decision boundary. This kernel is capable of solving multi class problems with allowable margin [58]. The polynomial kernel is defined as in [56], where 'C' is the regularisation or box constraint parameter, $k(x_i, x_j)$ is the kernel function, $x_i$ and $x_j$ denote feature space vectors, and 'd' is the degree of the polynomial function. The Quadratic and Cubic kernels are sub types of the polynomial kernel function of SVM. The quadratic kernel is a second order polynomial kernel function [59], and the cubic kernel is a third order polynomial kernel function [59,60]. For the polynomial kernel based SVM classifiers (quadratic and cubic), two parameters, the regularisation parameter 'C' and the polynomial degree 'd', strongly influence the performance level [57]. In this work, based on the observed highest accuracy and minimum mean absolute error of classification, the value of 'C' is set to 12 for both the quadratic and cubic kernel based SVM classifiers, and the value of 'd' is set to 2 for the quadratic and 3 for the cubic kernel based SVM classifier. The steps of the classification process with the polynomial kernel SVM classifier are illustrated in Table 4.
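To make the role of the degree 'd' concrete, the sketch below evaluates a polynomial kernel for d = 2 and d = 3. It assumes the widely used inhomogeneous form $(x_i \cdot x_j + 1)^d$; the exact expressions used in this study may differ in their constant term, and the `offset` parameter name is hypothetical.

```python
def polynomial_kernel(x_i, x_j, degree, offset=1.0):
    """Polynomial kernel (x_i . x_j + offset) ** degree (assumed form)."""
    dot = sum(a * b for a, b in zip(x_i, x_j))
    return (dot + offset) ** degree


x_i, x_j = [1.0, 2.0], [3.0, 4.0]            # hypothetical feature vectors
quadratic = polynomial_kernel(x_i, x_j, 2)   # (11 + 1) ** 2 = 144
cubic = polynomial_kernel(x_i, x_j, 3)       # (11 + 1) ** 3 = 1728
```

Raising the degree lets the decision boundary bend more sharply, which is consistent with the cubic kernel separating the transient classes better than the quadratic and linear kernels, at the cost of a higher risk of over-fitting.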

Random subspace (RS) ensemble classifier
The RS ensemble classifier achieves its benefits by applying random subsets of features to the combined set of base classifiers [53]. A majority voting rule is applied to the output predictions of the weak classifiers to obtain the target class labels at the final stage of classification [38]. The RS ensemble approach improves the performance and accuracy of the weak classifiers by effectively exploiting their output predictions. Furthermore, because the classifiers are trained on smaller subspaces with the RS technique, the features to instance ratio can be significantly improved [38].

Process steps of classification with the SVM polynomial kernel (Table 4):
Step 1: Randomly split the data set X into k subsets of equal size: X = (X1, X2, X3, ..., XK); (K = 10);
Step 2: For k = 1 to 10; train the SVM polynomial kernel classifier from D or DK
Step 3: Apply the kernel function 'Quadratic' (Eq (15)) / 'Cubic' (Eq (16))

For this RS ensemble model, a $p^*$-dimensional feature subset ($p^* < p$) is randomly chosen from the given $p$-dimensional data set for PQ analysis. The weak classifier is then trained using the subspace feature vectors. This procedure is repeated M times in order to train M classifiers, each with a new subset of feature vectors. Finally, majority voting is used to combine the predictions of the M classifiers. The RS method process steps are explained in more detail in Table 5 [39,40].
In general, the classification performance of the RS ensemble technique is determined by two main factors: the size of the feature subset (subspace) and the number of weak classifiers (ensemble size). In this research work, an SVM based (cubic kernel) RS ensemble classification model is proposed to discriminate between different PQEs in the PV connected MG network in both modes of operation (on-grid and off-grid). To obtain better performance, a subspace size of 0.5 and 10 weak classifiers (SVM cubic kernel) are assigned to the proposed RS ensemble model.
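The random subspace procedure can be sketched as below, using the subspace size of 0.5 and the ensemble size of 10 stated above. To keep the example self-contained, a simple nearest-centroid learner stands in for the cubic kernel SVM base classifier; that substitution, the function names, and the four-feature toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def train_centroid(X, y):
    """Nearest-centroid learner; a simple stand-in for the cubic kernel SVM."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def predict(row):
        return min(centroids, key=lambda c: sum(
            (a - b) ** 2 for a, b in zip(row, centroids[c])))
    return predict


def train_random_subspace(X, y, n_learners=10, subspace_fraction=0.5, seed=0):
    """Train each learner on its own randomly drawn subset of the features."""
    rng = random.Random(seed)
    p = len(X[0])
    p_star = max(1, int(subspace_fraction * p))  # subspace size p* < p
    ensemble = []
    for _ in range(n_learners):
        features = sorted(rng.sample(range(p), p_star))
        X_sub = [[row[f] for f in features] for row in X]
        ensemble.append((features, train_centroid(X_sub, y)))
    return ensemble


def predict_random_subspace(ensemble, row):
    """Majority vote over the learners, each seeing only its own features."""
    votes = Counter(model([row[f] for f in features])
                    for features, model in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two classes, four features.
X = [[0.1, 0.2, 0.0, 0.3], [0.3, 0.1, 0.2, 0.0],
     [9.8, 10.1, 10.0, 9.9], [10.2, 9.9, 10.1, 10.0]]
y = ["sag", "sag", "swell", "swell"]
ensemble = train_random_subspace(X, y)
label = predict_random_subspace(ensemble, [0.5, 0.2, 0.1, 0.4])
```

Because every learner sees a different projection of the data, their errors are less correlated than those of identical learners trained on the full feature set, which is what the majority vote exploits.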

Description of performance factors
The definitions of the important performance factors (PF) used to evaluate the effectiveness of the classifiers are given below:
• Precision (P): the ratio between the correctly predicted observations (true positives) and the total predicted observations (true positives + false positives) [36]: $P = \frac{T_P}{T_P + F_P}$, where $T_P$ denotes the true positives and $F_P$ denotes the false positives.
• Recall (R): the ratio between the correctly predicted observations (true positives) and the sum of all observations, including true positives and false negatives [36]: $R = \frac{T_P}{T_P + F_N}$, where $F_N$ denotes the false negatives.
• F-measure: the weighted average (harmonic mean) of precision and recall [36]: $F = \frac{2 P R}{P + R}$.

RS method process steps (Table 5):
Input: Training data set, $D = \{D_1, D_2, \ldots, D_m\}$ (let $D = X$) (17)
Step 1: Consider that each training sample $K_j$ has a $p$-dimensional feature vector, written as $K_j = \{K_{j1}, K_{j2}, \ldots, K_{jp}\}$, $j = 1, 2, \ldots, m$ (18)
Step 2: Randomly select $p^*$ feature elements ($p^* < p$) from the $p$-dimensional feature vector $T_j$. Then,
(i) the training sample of the original set $X$ becomes $X^r$, denoted as $X^r = \{K_1^r, K_2^r, \ldots, K_m^r\}$ (19)
(ii) each training sample in $X^r$ consists of $p^*$-dimensional feature elements, and stated as,
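The performance factors defined in this section can be computed per class from the counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; the counts in the example are hypothetical and are not taken from the reported confusion matrices.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F-measure) for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical class with 40 instances: 38 found, 2 missed, 2 false alarms.
p, r, f = precision_recall_f(tp=38, fp=2, fn=2)
```

Precision penalises false alarms and recall penalises missed events, so a class such as Transients 1, which is both missed and confused with others, would show a drop in both.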

Results and discussion of PQ analysis
In this section, the results of the PQ analysis, which include the extraction of features with the DWT method and the classification of the various PQEs in the MG network, are discussed in detail. The proposed MG model was simulated with the creation of various PQEs in both the off-grid and on-grid modes of the MG network under STC and under real time variation of solar PV irradiance. A simulation time of 1 s was considered for each PQE case studied. The switch conditions for the creation of the various PQEs, along with the time span of the occurrence of each PQE in both modes of the MG network, are given in Table 2.

Results and discussion of DWT analysis
From the DWT analysis, the extracted energy value features of the various PQEs in the MG network were used to train the proposed RS ensemble and the kernel based SVM classifiers (linear and polynomial (quadratic and cubic)). The main factors considered during the analysis of the PQE signals with the DWT method were the mother wavelet (db4), the decomposition level (5th), and the sampling frequency (24 Hz). From the analysis of all PQE signals, the detail (d1 to d5) and approximation (a5) wavelet coefficients were obtained for further analysis. From the results of the DWT analysis, it can be concluded that the wavelet transform coefficient (d5) has a small magnitude for the normal signal and that its magnitude varies abruptly during voltage sag and transient conditions. Similarly, the decomposition analysis was also carried out for the other PQEs and transient signals in all three phases of the MG network. Finally, the coefficients extracted from the voltage and current signals of the different PQEs were used to evaluate the energy values, using Eq (6).

Results of classification analysis
The extracted energy values of the input data set (400 samples), with adoption of the 10-fold validation method, were applied to train the kernel based SVM learners (linear kernel and polynomial kernels (quadratic and cubic)) and the SVM based RS ensemble classifier. In this study, the common PQEs (normal, sag, swell, and harmonic distortion) in the off-grid mode of the MG network and the PQ transients (Transients 1, Transients 2, and Transients 3) in both the off-grid and on-grid modes of the MG network were classified by the kernel based SVM learners and the RS ensemble model under STC and under real time variation of the solar irradiance of the PV source. During this analysis, 400 instances (40 instances per PQE) were considered to train the classifiers. The classification accuracy (CA) of a classifier, as given in Eq (26), is defined as the ratio between the correctly predicted PQEs and the total number of PQEs studied. In the classification analysis, the effectiveness of the proposed RS ensemble classifier is verified (in terms of CA) against the results of the individual kernel based SVM classifiers.

Results of classification with kernel based SVM classifiers.
To achieve effective performance of the kernel based SVM classifiers, it is very important to select an appropriate value of the penalty or regularisation parameter "C" for classification. Changing the value of "C" can influence the accuracy, the classification error, and the margin of the hyperplane in SVM. In this study, the penalty factor "C" was tuned manually to obtain the minimum classification error and maximum accuracy of the kernel classifiers. For the linear kernel SVM classifier, the tuning range of "C" was between 1 and 12 (in steps of 1). The tuning of "C" for the linear SVM classifier with respect to the classification error and accuracy is shown in Fig 13. From the tuning results (Fig 13), it can be noticed that the maximum classification accuracy (87%) and minimum error (0.158) were achieved at a "C" value of 9. Similarly, the tuning range of "C" was between 1 and 14 (in steps of 1) for the polynomial kernel (Quadratic and Cubic) SVM classifiers. From the tuning results of "C", as shown in Figs 14 and 15, the maximum accuracy (91% and 94%) and minimum error (0.154 and 0.152) were achieved at a "C" value of 12 with the Quadratic SVM and Cubic SVM classifiers, respectively.
(a) SVM classifiers: Classification results under STC of solar PV. In this case, the PQEs in both the on-grid and off-grid modes of the MG network were classified with the kernel based SVM classifiers; the confusion matrix results are given in Table 6. The correctly and incorrectly classified instances of each PQE in the confusion matrix are represented as diagonal and off-diagonal elements, respectively. Furthermore, the details of all the classified instances for all the PQEs in the MG network are given in Table 7.
From the prediction results of the kernel based SVM classifiers (Tables 6 and 7), it is clear that the misclassification rate for several PQEs (Swell, PQ Transients 1 and 2 in the islanded network, and Transients 1 in the grid connected network) was higher with the linear kernel classifier than with the Quadratic and Cubic SVM classifiers. For the Quadratic type, the misclassification rate was moderate, between those of the Linear and Cubic type classifiers. Most of the instances of the PQEs (except Transients 1) were classified correctly with the Cubic type of classifier. Because of the reduced misclassification rate with the Cubic type classifier, its classification accuracy (94%) was significantly improved compared to the Quadratic (91.3%) and Linear (87%) types of SVM classifier. Thus, this study shows that the cubic kernel SVM classifier offers better performance and is more suitable for classifying the various PQEs in the MG model of the power network than the other types (Quadratic and Linear).
(b) SVM classifiers: Classification results under real time varying solar. In this analysis, the real time varying solar data used in [37] was considered for the PV source during the analysis of all PQEs in both the on-grid and off-grid modes of the MG network. The real time solar data for a period of 1 s (with 10 slots of 0.1 s intervals) is shown in Fig 16. Based on the confusion matrix results (as in the STC case analysis), the classification results of the kernel based SVM classifiers were obtained, as given in Table 8. As in the STC analysis, the results showed that the misclassification rate was higher with the Linear kernel SVM classifier than with the Quadratic and Cubic type classifiers. Also, the misclassification rate was moderate with the Quadratic and significantly reduced with the Cubic type SVM classifier under this case condition. However, the classification accuracy was slightly lower for all the classifiers than in the STC case analysis. In comparison to the Cubic type SVM classifier, the misclassification rate of the analysed PQEs was high with the Linear and Quadratic types. Thus, the cubic kernel SVM classifier provides higher classification accuracy (91.1%) than the Quadratic (88%) and Linear (83.3%) kernel SVM classifiers. This analysis clearly shows that the Cubic kernel SVM classifier provides more promising results (more than 91%) than the other types, even when the solar irradiance of the PV source varies.
From the accuracy results summarised in Table 9, it can be concluded that the classification accuracy was lower with the SVM linear kernel than with the SVM quadratic and SVM cubic kernel classifiers. Since the linear SVM is better suited to linear multiclass problems than to non-linear ones, its misclassification rate was high for the non-linear PQ transients, whereas the polynomial SVM kernels (quadratic and cubic) handle the non-linear PQ transients better, with a reduced misclassification rate. Moreover, the order of the polynomial function has a significant impact on the performance of polynomial kernel classifiers. Because the cubic kernel SVM classifier has a higher polynomial order, it provides higher classification accuracy than the quadratic type under both STC and varying solar conditions.

Furthermore, to improve the generalisation ability and overall accuracy of the SVM classifiers, an SVM based RS ensemble classifier is proposed for analysis. The results of the classification analysis with the RS ensemble method are discussed in detail in the following section.

Results of classification with proposed RS ensemble classifier.
In the RS ensemble method, randomly picked subsets of features (n) from the full feature space of the input data set (D) are used to train N base classifiers, and the predictions of the base classifiers are combined using the majority voting rule. The size of the feature subset (subspace size) and the size of the ensemble (number of base classifiers) have a significant influence on the convergence and performance of the RS ensemble classifier [61]. Hence, it is necessary to select appropriate values of the subspace feature size and the number of base classifiers (ensemble size) in the RS ensemble model. As a rule of thumb, selecting the size of the feature subset as n = D/2 features can yield promising results with the RS ensemble classifier [62,63]. Therefore, in this work, a feature subset size of 0.5 was considered, and the optimum ensemble size was obtained through manual tuning. The tuning results of the ensemble size with the proposed RS model, as shown in Fig 17, clearly indicate that the maximum classification accuracy (99.3%) and minimum error (0.144) were achieved with an ensemble size of 10 (among the sizes 1 to 12 analysed). In the analysis of the kernel based SVM classifiers, the Cubic kernel SVM classifier was the most effective and attained higher classification accuracy (94%) than the other kernel types. Therefore, the Cubic kernel SVM classifier was considered as the base classifier in the proposed RS ensemble model to achieve further improvement in classification performance.

Tables 10 and 11, respectively. From the prediction results of the RS ensemble classifier (Tables 10 and 11), it can be seen that all instances of most PQEs (Normal, Sag, Harmonics, Transients 2 and 3 in the islanded MG, and all the transients in the grid connected MG) were classified correctly (100%); only one instance of Voltage Swell and two instances of Transients 1 in the islanded MG network were misclassified. Owing to this high rate of correct classification, the accuracy of the RS ensemble classifier (99.3%) was significantly higher than that of the Cubic kernel SVM (94%) and the other kernel types (Quadratic, 91.3%; Linear, 87%). Thus, the proposed RS ensemble model is more suitable for discriminating all the PQEs in the MG network, with a substantial improvement in classification accuracy over the individual kernel based SVM classifiers.

Table 12. As with the results obtained under the STC case analysis, the instances of most PQEs were classified correctly (100%); only 2 instances of Swell, 6 instances of Transients 1 of the islanded network, and 4 instances of Transients 1 of the grid connected network were misclassified. Compared with the kernel based SVM classifiers (under both STC and real time solar variation), the proposed ensemble classifier still provides a higher classification accuracy (97%) than the Cubic (94% with STC and 91.3% with varying solar), Quadratic (91.3% with STC and 88% with varying solar), and Linear (87% with STC and 83.3% with varying solar) kernel SVM classifiers. Compared with the RS ensemble results under the STC case analysis, the classification accuracy was slightly reduced under real time varying irradiance of the solar PV. However, the proposed RS ensemble classifier still provides higher classification accuracy than the individual kernel based SVM classifiers, even under uncertain conditions of solar PV. As summarised in Table 13, a significant improvement in accuracy was achieved with the RS ensemble method under both STC and uncertain solar power conditions. Thus, the RS ensemble classifier benefits from training the assigned set of SVM classifiers on randomly selected feature subsets and combining them through an ensemble strategy, which yields higher classification accuracy and a lower risk of bias in classifying the PQEs.
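The reported accuracies can be cross-checked against the misclassification counts given above. The check below assumes a total of N = 400 classified instances; this value is inferred from the reported figures and is not stated in this section.

```latex
\text{Accuracy} = \frac{N - N_{\text{mis}}}{N}, \qquad
\text{STC: } \frac{400 - (1 + 2)}{400} = 0.9925 \approx 99.3\%, \qquad
\text{varying irradiance: } \frac{400 - (2 + 6 + 4)}{400} = 0.97 = 97\%
```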

Performance analysis
In the performance analysis, performance factors (PF) such as KS, Precision, Recall, F-Measure, and ROC were evaluated for the proposed RS ensemble classifier and the kernel based SVM classifiers to further verify their effectiveness under STC and under real time varying solar irradiance of the PV. Table 14. From the results, it can be seen that the KS value (0.967) was high and substantially improved with the RS ensemble classifier compared with the kernel based SVM classifiers. Furthermore, a significant improvement in Precision (0.973), Recall (0.970), and F-Measure (0.969) was observed

promising results in PF than the other kernel based SVM classifiers, even under uncertain conditions of solar PV. The proposed RS ensemble classifier benefits (higher accuracy with superior performance) from the use of random subspaces and from applying the ensemble strategy to the predictions of the assigned Cubic kernel SVM models. Thus, the results of this study show that the proposed RS ensemble model is more effective and robust for discriminating between different PQEs in both modes of the MG network under STC and uncertain conditions of solar PV.
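The performance factors discussed in this section can all be derived from a multiclass confusion matrix. The sketch below is illustrative only: it assumes that KS denotes Cohen's kappa statistic and that Precision, Recall, and F-Measure are averaged over classes weighted by class size (neither is restated in this section), it omits ROC, and the function name and the 3-class matrix are hypothetical rather than results of this study.

```python
def performance_factors(cm):
    """Compute kappa and support-weighted precision, recall and F-measure.

    cm[i][j] is the number of class-i instances predicted as class j.
    Assumes the classifier is not at pure chance level (expected < 1).
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    actual = [sum(cm[i]) for i in range(k)]                         # row sums
    predicted = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # column sums

    observed = sum(cm[i][i] for i in range(k)) / total
    expected = sum(actual[i] * predicted[i] for i in range(k)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)

    precision = recall = f_measure = 0.0
    for i in range(k):
        p = cm[i][i] / predicted[i] if predicted[i] else 0.0
        r = cm[i][i] / actual[i] if actual[i] else 0.0
        f = 2.0 * p * r / (p + r) if (p + r) else 0.0
        weight = actual[i] / total  # class support as averaging weight
        precision += weight * p
        recall += weight * r
        f_measure += weight * f
    return {"kappa": kappa, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Hypothetical 3-class confusion matrix with one misclassified instance.
example_cm = [
    [10, 0, 0],
    [1, 9, 0],
    [0, 0, 10],
]
factors = performance_factors(example_cm)
for name, value in factors.items():
    print(f"{name}: {value:.3f}")
```

With support-weighted averaging, the weighted recall equals the overall accuracy, which is why the Recall and accuracy values of a classifier coincide under this convention.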

Comparative analysis with existing literature works and nonlinear classifiers
This section compares the PQE classification performance of the proposed RS ensemble method with that of other works in the literature. As shown in Table 15, the classification accuracy of the different classifiers ranges from 95.30% to 100%. According to Table 13, the works in [14,33] studied different PQEs in simple power networks without integration of RE sources, whereas the works in [32,35,64] discriminated different PQEs in RE integrated MG networks but did not analyse them under uncertain RE source conditions. In this study, by contrast, different PQEs and transients due to switching and LG fault events were classified with the proposed RS ensemble method in the PV integrated MG network under real-time varying solar irradiance of the PV system. Compared with the other works, the proposed RS ensemble method is more robust in discriminating PQEs, with an accuracy of 97% even under uncertain conditions of the RE source. Furthermore, a comparison of accuracy between the proposed RS ensemble method and other non-linear classifiers is illustrated in

Conclusions
In this study, an SVM based RS ensemble classification model is proposed to detect and discriminate the most common PQEs, namely sag, swell, and harmonic distortion in the off-grid mode, and different PQ transients in both the on-grid and off-grid modes of the PV integrated MG network, under two conditions: 1) STC of solar PV; and 2) varying solar irradiance of PV in real time. The effectiveness of the proposed RS ensemble model is verified against the results of individual kernel based SVM classifiers (linear and polynomial (Quadratic and Cubic)). The Matlab-Simulink software tool is used to develop and simulate the PV integrated MG network for analysis. In the pre-stage of classification, energy features are extracted from the disturbance signals of the various PQEs by the DWT technique. These features are then used to train the RS ensemble classifier and the individual kernel based SVM classifiers (Linear, Quadratic, and Cubic) to obtain the target class labels at the final stage of classification. The classification results show that the proposed RS ensemble classifier offers higher classification accuracy under STC (99.3%) and varying solar conditions (97%) than the individual kernel based SVM classifiers (Linear (87% with STC and 83.3% with solar variation), Quadratic (91.3% with STC and 88% with solar variation), and Cubic (94% with STC and 91.3% with solar variation)). The effectiveness of the RS ensemble classifier is further verified through performance analysis, where the PF results clearly show that the proposed RS ensemble model outperforms the individual kernel based SVM classifiers. Thus, it can be concluded that the proposed SVM based RS ensemble model is more robust and offers excellent performance for the classification of different PQEs in the PV connected MG network under STC and uncertain conditions of solar PV.
The future scope of this work is the classification of complex PQEs in the MG power network using hybrid signal processing methods with advanced intelligent classifiers.